EFFECTS OF TASK INDEX VARIATIONS ON
TRAINING EFFECTIVENESS CRITERIA 

ABSTRACT 

A feasibility study was undertaken as part of a program to develop quantitative
techniques for prescribing the design and use of training systems. As
the second step in this program, the present study attempted to: (1) refine 
quantitative indices employed during earlier research; (2) conduct laboratory 
research on the effects which task index variations have on training criteria;
and (3) support the laboratory results with data gathered in the field. 

Two laboratory investigations and a field study were conducted. In the first
laboratory study, effects of variations in task indices on skill acquisition 
of a set-up task were examined. In a companion effort, preliminary data were
collected on relationships between task index variations and performance during
transfer of training. In the field study, quantitative task index data,
descriptive of a variety of sonar trainers and sonar trainee tasks, were re- 
lated to ratio estimates provided by instructors on four training effective- 
ness criteria. 

Significant multiple correlations were obtained between task indices and speed 
and accuracy of performance during skill acquisition. Predictor patterns 
changed over time and between criteria. Set-up task speed was predicted early 
in training, while errors made were predicted later during acquisition. Similar
but more provisional relationships were found during transfer of training.
Speed and, in particular, accuracy of performance during transfer bore con- 
sistent relationships to task index values. Support for these general find- 
ings was obtained in the field. Significant relationships were established 
between instructors' judgments of training criteria and trainee subtask index 
values.

The results continue to indicate that quantitative task index data can be pre- 
dictively related to training criteria. Further development appears warranted. 
Future research should extend the laboratory findings especially for transfer 
of training, and should seek to generalize these results to field settings 
through the collection of performance data. 
FOREWORD

PURPOSE 

The objective of this research project is to develop quantitative indices of 
the characteristics of instructors' and trainees' tasks so that the effective- 
ness of a given amount and type of training on a given task can be predicted. 
The results of this research should lead to greater accuracy in establishing
the human performance requirements in a training system, greater accuracy in
human factors design recommendations, and improved instructor station design.

ACCOMPLISHMENTS

In the first phase of this research project, the feasibility of an initial set 
of quantitative indices in describing the trainee tasks on three sonar opera- 
tor training devices was demonstrated. 

In addition, the feasibility of using quantitative task characteristic indices
to predict performance was tested by describing the characteristics of tracking
tasks appearing in the experimental literature and predicting tracking
performance. (The AD number for ordering the technical report which describes
the first phase from the National Technical Information Service, Department of
Commerce, Springfield, Va., 22151, is AD 722423.) 

In the second phase of this research project — which this technical report 
describes — the objective was to determine the relationships between systematic 
variations in quantitative task characteristic indices and performance mea- 
sures. This was successfully accomplished by learning and transfer experi- 
ments in the laboratory and a field validation exercise. 

Strong relationships between performance measures and variations in task in- 
dices (representing various configurations of synthetic trainer tasks) were 
obtained. Further, the transfer experiment resulted in data which suggest 
the feasibility of predicting transfer effects from quantitative task indices. 
Finally, the data of the field study validated much of the laboratory data. 

PLANS

The next phase of this project will investigate the generality of the findings 
in this technical report to a different family of training devices. 




GENE S. MICHELI, Ph.D. 
Human Factors Laboratory 
SECTION I
INTRODUCTION 

One of the most difficult and complex problems confronting individuals 
responsible for training is the design and development of effective training 
devices. In military settings, where complex simulators and trainers often 
provide the basis for instruction, the problem is particularly acute. During 
development of these complex devices, options are nearly always available with 
respect to the design of trainee and instructor stations. Given such options, 
however, there is seldom any solid basis for choosing among them in terms of 
their relative effectiveness. Faced with alternative designs for the trainee's 
station, one finds it hard to specify with confidence those which will prove 
most effective in promoting rapid acquisition of skills and/or positive trans- 
fer to the operational situation. Similarly, given alternatives in design of 
the instructor's station, one may have difficulty in identifying those which 
will enable instructor personnel to function most effectively in carrying out 
their duties.

To deal with these and a series of allied training problems, it is essential
to have data relating selected parameters of alternative designs to aspects
of trainee and instructor performance. If consistent changes in these
criterion measures could be demonstrated as a function of systematic manipulation
of design parameters, then such information could be used to predict the
effects which different console layouts, sequences of operation, etc., might
have on the trainee's rate of learning or the instructor's level of performance.
The ability to make such forecasts would provide sounder bases for a variety of 
training system design decisions including, for example, appropriate degree of 
simulation fidelity, trainee to instructor ratios, and part versus whole train- 
ing. Equally important, accurate forecasts would aid in identifying those de- 
sign tradeoffs which could be made without compromising training effectiveness. 

BACKGROUND 

In spite of the promise inherent in this approach, the methodology required 
for its implementation has been slow in developing. A major obstacle to more 
rapid progress has been the lack of an adequate means for describing alternative 
designs. Essentially, a set of indices is desired in terms of which different
design configurations might be scaled quantitatively. Until such indices become 
available, the relationship between alternative design configurations and the 
different rates of learning or levels of performance associated with them can- 
not be meaningfully explored. 

In response to this problem, the Naval Training Equipment Center (NAVTRAEQUIPCEN)
initiated a program of research which was to be executed in a series of phases. 
The primary objectives of the first phase were to compile and to demonstrate 
the feasibility of applying a set of quantitative task indices. This effort, 
which has been described in detail elsewhere (e.g., Wheaton, Mirabella, and 
Farina, 1971) entailed several activities which included: (1) identifying 
design features of training devices which conceivably could be quantified; e.g., 
number of displays and controls and their arrangements; (2) exploring a variety 
of means for their quantification, relying primarily on indices and techniques 
previously developed and reported in the literature; and (3) determining the 
feasibility of using the assembled indices to quantify some actual training
devices. To keep the scope of this effort within manageable bounds, concern 
was limited to features of trainee stations found in various sonar training 
devices. In spite of this restriction, however, it was assumed that many of 
the features chosen for quantification would be relevant to other types of 
trainee stations as well as to instructor stations. Application of the indices 
to four trainee tasks (i.e., set-up, detection, localization, classification),
as represented in a small number of different devices, was attempted. This
exercise demonstrated that most, if not all, of the indices could be used to
scale quantitatively the extent and manner in which the trainee tasks differed 
within and across devices. 

RESEARCH OBJECTIVES 

As part of the larger research program and as a sequel to Phase I efforts,
the present study had three objectives. The first objective was to refine the
set of quantitative indices employed during the earlier research, adding new 
descriptors, if possible, while deleting those which proved unsatisfactory. 
The second objective was to conduct an investigation of the relationship between 
variations in quantitative indices and corresponding changes, if any, in selected
criterion measures. This effort was to be conducted in a laboratory
setting in order to exercise control over other variables not of immediate inter- 
est to the present study. The third and final objective was to determine whether 
support for relationships established in the laboratory could be provided by 
data collected in the field. Such support would increase confidence in the 
validity of the basic methodology, that of using quantitative task index information
to forecast the relative effectiveness of competing designs.

The remainder of this report describes the research performed in pursuit 
of the three primary objectives. In the next section, Section II, the method
of procedure is presented. The presentation starts with a description of how
devices were quantified in the field, and proceeds to a discussion of the 
methods employed in laboratory and field validation studies. The results of 
these studies are presented in Section III. In Section IV, the final section, 
the results are discussed in terms of their implications for the prediction of 
training device effectiveness and for future research. 
SECTION II

METHOD

The general approach pursued in the current research stemmed from results 
of the previous phase. As already indicated, the thrust of Phase I was to demonstrate
that alternative design configurations could be scaled quantitatively.
It remained to be established, however, that such scaling could be predictively 
related to learning and proficiency criterion measures. In order to provide 
such evidence, an approach was adopted consisting of three distinct but interrelated
activities. Quantification of devices in the field was continued using
a revised set of indices. The data obtained during this exercise were then used
in conducting a two-pronged validation study consisting of a laboratory and a
field effort. 

The dual validation effort was felt necessary because of inherent limita- 
tions in either the laboratory or field approach alone. While the laboratory 
approach would facilitate measurement and experimental control, it would re- 
quire generalization to actual field conditions. On the other hand, while the 
field effort would permit direct assessment of the quantitative indices, it 
presented the familiar problem of obtaining performance data under operational 
conditions. By pursuing both avenues it was hoped that their respective weak- 
nesses could be offset. 

QUANTIFICATION OF SONAR TRAINING DEVICES 

Before either validation effort could be initiated, quantitative task index 
data were required on a sample of actual devices. These data were intended to 
provide guidelines for the types and ranges of design characteristics to be 
manipulated in the laboratory. In addition, they were to be employed directly
in the anticipated field validation effort as the predictor variables. Accordingly,
efforts begun during Phase I to apply the quantitative indices were continued
during the present research.

Application of the indices was extended to several devices not examined 
during the earlier work. Altogether, 13 different trainee stations were quantified,
including: the 14E10/3 at Quonset Point, Rhode Island; the 14B31B (AQA-1
and ASA-20 stations), 14E14, and X14A2 at Norfolk, Virginia; the 21A39/2 (OA1283,
BQR-2C, and BQR-7 stations) at Charleston, South Carolina; and the 14E3, 14A2/C1,
SQS-26CX, and 21B55 (OA1283 and BQR-2B stations) at Key West, Florida.

The procedures involved in quantifying these devices have been described
at length in an earlier report (e.g., Wheaton, Mirabella, and Farina, 1971).
Briefly, instructor personnel familiar with the operation of each device were 
asked to perform and describe in detail all of the primary and contingency ac- 
tions comprising each of four trainee subtasks. These subtasks, found in most, 
but not all of the devices, included set-up, search or detection, localization, 
and classification. The task-descriptive data obtained for each subtask were 
then converted into flow-chart form for more convenient processing. An example 
of one of the types of flow charts generated is shown in Appendix A for the
SQS-26CX set-up subtask.


Upon conversion of the task descriptive data to flow-chart form, they were 
analyzed in terms of a variety of quantitative indices. A reduced set of indices
from the total compiled during Phase I was employed in the present research.

[Illegible] characteristic rating scales, were excluded because: (1) they were
often difficult to apply objectively, requiring a con[illegible]; and (2) they
referred in many instances to characteristics which, although varying across
very different types of devices, did not appear to reflect readily manipulable
design features (e.g., the [illegible]). [Illegible] were excluded either because
they generated [illegible] variation for the present types of devices or because
they had been found from past work to be correlated highly with other descriptors.

The [illegible] included 17 indices. A brief definition of each [illegible]
appropriate. [Illegible]

a. MAIN - defined as the number of responses comprising the main or
dominant procedural sequence in an operations flow chart.
In the flow chart of Appendix A, there are 24 of these control
and display actions (i.e., those connected by solid lines).

b. CNTG - defined as the number of responses comprising the auxiliary
or contingency procedures. The flow chart, shown in [illegible].

c. TA - defined as the total number of responses (actions) comprising
[illegible].

d. CONT - [definition illegible].

e. DISP - defined as the total number of different displays referenced
during performance of a subtask.

f. E - defined as the total number of different equipment elements
interacted with; this index is given by the sum of CONT and DISP.

g. LV - [illegible] the relative strength of the sequence of use among
the various controls and displays. As used here it is [illegible]
products of the number of times a link is used and [illegible]
(Fowler et al., 1968).

h. AA% - [illegible] reflecting the percentage of alternative actions
present in an operation. "A score of [illegible] means that the
highest [illegible] only one link out of and into each control,
with the same frequency used for all links" (Fowler et al., 1968).

i. [Illegible] describing the extent to which [illegible] (0%) or a
theoretically defined optimum number of times (100%).
j. DEI - a measure of the effectiveness with which information
flows from displays via the operator to corresponding controls.
The index yields a dimensionless number representing
a figure of merit for the total configuration of displays and
controls (Siegel, Miehle, & Federman, 1962).

k - m. D%, C%, E% - defined respectively as the number of display,
control, or combined equipment elements which the operator
actually employs relative to the total number of such elements
which are available for use.

n - q. CRPS, FBR, INFO, INST - refer to the frequency with which
the operator makes various types of responses during performance
of the task. Included are responses involving manipulation of
controls (CRPS), securing of feedback (FBR), acquisition of information
(INFO), as well as those primarily initiated by the
instructor (INST).

The values actually obtained on each of these 17 indices for the 13 trainee 
stations previously listed are presented in Appendix B. Four separate tables 
are presented corresponding to each of the basic trainee subtasks. The index 
data for all four subtasks were used as predictors in the field validation 
effort. The index data obtained for the various set-up subtasks provided guide- 
lines for the laboratory validation effort. 
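
As a rough illustration of how the count-based indices combine, the following Python sketch computes E and the D%, C%, and E% ratios for task 9 of table 2 (simple task, all indicators, embedded in the complex trainer). The element counts are taken from table 2. The function name, the rounding to whole percentages, and the treatment of TA as the sum of MAIN and CNTG are assumptions made for this example and are not procedures stated in the report.

```python
def equipment_indices(cont_used, disp_used, cont_avail, disp_avail):
    """Return E, D%, C%, and E% from counts of used and available elements.

    Hypothetical helper for illustration only; rounding to whole
    percentages is an assumption.
    """
    e_used = cont_used + disp_used        # E = CONT + DISP
    e_avail = cont_avail + disp_avail     # all elements present on the panel
    return {
        "E": e_used,
        "D%": round(100 * disp_used / disp_avail),
        "C%": round(100 * cont_used / cont_avail),
        "E%": round(100 * e_used / e_avail),
    }


# Task 9 in table 2: simple task (CONT = 23, DISP = 13) performed on the
# complex panel, which carries 34 controls and 24 displays (task 1).
task9 = equipment_indices(cont_used=23, disp_used=13,
                          cont_avail=34, disp_avail=24)

# Assumed relationship, consistent with the table 2 rows: TA = MAIN + CNTG.
task9_ta = 43 + 20
```

For these inputs the sketch gives E = 36, D% = 54, C% = 68, and E% = 62, which agree with the table 2 entries for task 9.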

LABORATORY VALIDATION OF INDICES 

The general approach to laboratory validation was to develop a modularized, 
synthetic sonar trainer, capable of being readily configured into a large number 
of sonar "trainers," varying in design characteristics, but with a common set 
of functions. The trainer was designed to evaluate set-up behavior alone. Other 
subtasks, i.e., detection, tracking, classification, were excluded because the
instrumentation necessary was considered beyond the scope of available time and
resources.

CONCEPTUALIZATION AND DEVELOPMENT OF THE SYNTHETIC TRAINER. Design of the 
trainer was preceded by an extensive examination and analysis of the task data 
collected during this and the previous phase of our research. Working from both
the original task-analytic data and derivative flow charts, essential set-up
functions were identified on a trainer-by-trainer basis. A relatively common
set of functions, i.e., cutting across all the trainers studied, was generated
(table 1). These functions are basic activities performed by the sonar trainee
operator during set-up and are relatively common to all the sonar devices which
have been explored in this program. Approximately 23 set-up functions were 
identified. Some of these were later combined to yield a reduced set of 19 
functions. For each of these 19 functions, an equipment module was eventually 
designed. 

On a second pass through the devices, displays and controls needed for each 
function were identified. These displays and controls were then collapsed 
across devices, and duplicate units eliminated to arrive at a final, non-redundant
set for each function. These sets of equipment elements were the
basis for designing a module for each of the 19 functions.



TABLE 1. SET-UP TASK FUNCTIONS
IDENTIFIED FROM TASK-ANALYTIC REVIEW

 1. Energize the console
 2. Check gyro status
 3. Activate calibration mode
 4. Select transducer operation modes, e.g., active/passive, ATF/MTB
 5. Select range scale and adjust range cursor

Adjust PPI intensity/focus for:
 6. Overall scope
 7. Sweep
 8. Cursor

 9. Adjust audio for comfort level
10. Adjust console illumination for comfort level

Insert sonar parameters
  Geo-references:
11. True/relative
12. Speed
13. Course
14. Ship centered display/target centered display
  Other parameters:
15. Sound velocity
16. Pulse length/dwell time
17. Frequency
18. Sum/difference
19. Depression elevation angle

Calibrate the PPI re:
20. Range cursor
21. Bearing cursor
22. Sweep

23. Check signal meters for operation
Each module contained displays and controls which duplicated actual hardware
found in the sonar devices, or which represented the essential functions
of actual hardware. Representative displays and controls were used where the
complexity of actual hardware was beyond the scope of the current effort. For
example, simple meter movements arranged as voltmeters across a variable voltage
source were used in place of the PPI. Manipulating this voltage source to
effect a change in meter reading is somewhat analogous to manipulating a handwheel
to effect changes in the position of a PPI range or bearing cursor. It
was felt that the essential decision-making and perceptual-motor activity could
be abstracted via this kind of substitution of hardware, even though the substituted
version might appear rather different from the actual hardware. Where
actual hardware consisted of such items as toggle switches, function switches,
meters, and jeweled signal lights, actual hardware was used.

For most of the modules, a "simple" and a "complex" form was constructed 
to represent simple versus more complex hardware for discharging essentially 
the same function. Altogether, a total of 30 different modules was available 
for combination into a variety of trainer configurations.

SELECTION OF TRAINER CONFIGURATIONS. For purposes of the present research, 
an attempt was made to compile a set of configurations which would vary as much 
as possible along the 17 design indices selected for study. Toward this end 
two anchor configurations were initially selected representing extreme designs. 
There was a "complex" trainer consisting of all complex panels and a "simple" 
trainer consisting of all the simple panels which were available (i.e., simple
panels were used at all those locations for which simple panels had been constructed).
The complex and simple configurations are shown in figures 1 and 2.
Given the two extreme configurations, an intermediate configuration was then 
generated by randomly selecting either a complex or a simple module for each 
function on the trainer console. This configuration, known as the medium-all
trainer, is shown in figure 3. 
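
The random selection step can be sketched as follows. This is a minimal illustration in Python and not the procedure actually used. The report does not say which functions had both a simple and a complex module, so the `HAS_BOTH_FORMS` set, the function numbering, the fallback to the complex form, and the fixed seed are all hypothetical. The split of 11 two-form functions is inferred from the 19 functions and 30 modules reported.

```python
import random

N_FUNCTIONS = 19
# Hypothetical: functions assumed to have both a simple and a complex module.
# Inferred split: 19 functions plus 11 additional forms gives 30 modules.
HAS_BOTH_FORMS = set(range(1, 12))


def medium_configuration(seed=0):
    """Pick a module form for each function, at random where a choice exists."""
    rng = random.Random(seed)  # fixed seed so the draw can be repeated
    config = {}
    for function in range(1, N_FUNCTIONS + 1):
        if function in HAS_BOTH_FORMS:
            config[function] = rng.choice(["simple", "complex"])
        else:
            # Assumed: only one form was built, treated here as the complex panel.
            config[function] = "complex"
    return config


config = medium_configuration()
```

Passing a different seed yields a different intermediate configuration, while the functions assumed to have a single form are unaffected by the draw.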

In addition to these three primary trainers, nine additional trainers were 
selected to yield a range of design parameter values. These configurations 
essentially represented variations in the simple trainer or the medium trainer;
i.e., the simple trainer embedded in the complex, medium trainer with feedback
lights removed, simple trainer with additional contingency responses included
in the training regimen. These manipulations were aimed at reducing correlations
among the design parameters, in particular the correlation between number
of displays or controls and other design characteristics.
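
The concern with correlated design parameters can be illustrated by computing a product-moment correlation between two index columns. The Python sketch below uses DISP and DEI values as read from table 2 for the ten tasks with legible entries on both indices (tasks 2 through 5 and 7 through 12). The helper name `pearson_r` and the choice of this particular pair of indices are assumptions for illustration, not an analysis reported here.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Values as read from table 2, tasks 2-5 and 7-12.
disp = [19, 12, 12, 13, 9, 11, 13, 13, 8, 8]
dei = [9.7, 10.9, 8.0, 16.3, 8.2, 9.9, 12.7, 14.3, 17.3, 21.3]

r = pearson_r(disp, dei)
```

A coefficient near zero for a pair of indices would indicate that the selected configurations vary the two characteristics more or less independently.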

For each trainer, a specific set of procedures or sequence of responses was 
developed. These served to define "trainee" tasks analogous to the trainee set- 
up subtasks associated with actual sonar training devices. To the extent that 
equipment elements were present on a panel, but not involved in task performance, 
the task was said to be embedded. If a reduced number of feedback lights was 
used, the task was labeled according to those indicator groups which were used 
(i.e., none, every third, all). The 12 tasks which were employed are listed in 
table 2, together with their values on the same set of task indices previously
applied in the field. 
TABLE 2. TASK CHARACTERISTIC INDEX
VALUES FOR SYNTHETIC TRAINER TASKS

                                          Task Indices
Task                     MAIN  CNTG   TA  CONT  DISP    E      LV  AA%  [illegible]

 1. Complex Task           69    46  115    34    24   58  7591.2   65           66
    All Indicators

 2. Medium Task            50    34   84    27    19   46  5788.8   68           73
    All Indicators

 3. Medium Task            47    23   70    27    12   39  4922.1   70           71
    Third Indicator

 4. Medium Task + 2        47    25   72    27    12   39  4922.1   68           68
    Third Indicator
    Embedded in Complex

 5. Simple Task            43    20   63    23    13   36  4516.7   71           83
    All Indicators

 6. Simple Task + 6        41    20   61    --    --   32  4125.0   67           78
    Third Indicator

 7. Simple Task + 6        41    20   61    23     9   32  4125.0   67           78
    Third Indicator
    Embedded in Complex

 8. Medium Task + 2        46    23   69    27    11   38  4722.1   68           70
    None

 9. Simple Task            43    20   63    23    13   36  4516.7   71           83
    All Indicators
    Embedded in Complex

10. Simple Task            43    20   63    23    13   36  4516.7   71           83
    All Indicators
    Embedded in Medium

11. Simple Task            40    12   52    23     8   31  3728.7   71           89
    None
    Embedded in Medium

12. Simple Task            40    12   52    23     8   31  3728.7   71           89
    None
TABLE 2. TASK CHARACTERISTIC INDEX
VALUES FOR SYNTHETIC TRAINER TASKS
(Cont)

                                      Task Indices
Task                      DEI    D%    C%    E%  CRPS  FBR  INFO  INST

 1. Complex Task           --   100   100   100    --   --    --    --
    All Indicators

 2. Medium Task           9.7   100   100   100    44   19    21     6
    All Indicators

 3. Medium Task          10.9   100   100   100    44    9    17     6
    Third Indicator

 4. Medium Task + 2       8.0    50    79    67    46    9    17     6
    Third Indicator
    Embedded in Complex

 5. Simple Task          16.3   100   100   100    35   12    16     6
    All Indicators

 6. Simple Task + 6      10.8   100    --   100    39    8    14     6
    Third Indicator

 7. Simple Task + 6       8.2    37    68    55    39    8    14     6
    Third Indicator
    Embedded in Complex

 8. Medium Task + 2       9.9   100   100   100    46    7    16     6
    None

 9. Simple Task          12.7    54    68    62    35   12    16     6
    All Indicators
    Embedded in Complex

10. Simple Task          14.3    68    85    78    35   12    16     6
    All Indicators
    Embedded in Medium

11. Simple Task          17.3    42    68    67    34    5    13     6
    None
    Embedded in Medium

12. Simple Task          21.3   100   100   100    34    5    13     6
    None
EXPERIMENTAL PROCEDURE. Following development of the synthetic trainer
and selection of the specific tasks to be studied, the testing portion of the
laboratory effort was initiated. Subjects who were to serve as trainees during
this portion of the study were recruited from universities in the metropolitan
Washington, D.C., area. The subjects were males who, on the average, were 22
years old, 71 inches tall, and weighed 159 pounds. Subjects were randomly as- 
signed in groups of five to each of the 12 experimental tasks. The 60 subjects 
employed in this manner were paid for their services. 
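
The assignment of 60 subjects to 12 tasks in groups of five can be sketched as a shuffle followed by dealing into equal blocks. This Python fragment is a minimal illustration under stated assumptions: the subject identifiers, the function name `assign_subjects`, and the seed are hypothetical, and the report does not describe the randomization device that was actually used.

```python
import random


def assign_subjects(subject_ids, n_tasks=12, group_size=5, seed=0):
    """Shuffle subjects and deal them into equal-sized task groups."""
    shuffled = list(subject_ids)
    if len(shuffled) != n_tasks * group_size:
        raise ValueError("subject count must equal n_tasks * group_size")
    rng = random.Random(seed)  # hypothetical fixed seed for repeatability
    rng.shuffle(shuffled)
    return {
        task: shuffled[(task - 1) * group_size: task * group_size]
        for task in range(1, n_tasks + 1)
    }


# Subjects numbered 1 through 60 for illustration.
groups = assign_subjects(range(1, 61))
```

Each subject appears in exactly one group, and each of the 12 tasks receives five subjects.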

Upon arrival at the American Institutes for Research (AIR), each subject 
was ushered into the laboratory and seated before the experimental console, con- 
figured according to the task group to which the subject had been assigned. 
The following standard instructions were then read: 

The experiment you are taking part in today is part of a 
research program to study how well and how quickly people learn
to operate equipment, which is designed in a variety of differ- 
ent ways. Your task will be to learn to operate the equipment 
which is before you. I will go through the operation of the device
step-by-step with you. I will do this twice, and then I
will ask you to repeat the operations from memory a number of 
times. I will correct errors or omissions which you make, but 
please do your best to recall the operations. Accuracy and speed 
are both important for obtaining valid research data. Following 
each run-through, you will be asked to leave the room so that the 
equipment can be reset. You may wait in the lounge while this is 
being done. Are there any questions? 

Following presentation of these instructions, the subject was given de- 
tailed information on how the task was to be performed. Using a specially pre- 
pared flow chart, similar to that presented in Appendix C for the complex-all 
task, the subject was instructed step-by-step in the procedure to be learned. 
An important aspect of these instructions concerned the standardized reporting
language which the subject was to use when describing his task responses. For
example, instructions for Panel 1 of the complex trainer included the following:



INSTRUCTION:                          VERBAL RESPONSE:

Set main power Switch #1              "1 to Standby"
to Standby Position

Check main power Indicator #2         "2 is Green"
for Green Indication



Standardized responses were used to minimize the variability inherent in
the time required for verbalization of behavior. The complete set-up procedure
was described twice in this manner after which any final questions were answered.

Following this orientation session, 15 experimental trials were administered.
Preliminary pilot work indicated that performance reached asymptote within this
number of trials for a prototype trainer. Prior to each trial the subject left
the testing area and the experimenter set all controls in randomized positions
according to a predetermined scenario. Programming of the various trainer configurations
was of the simplest kind. The experimenter preset switches and displays
either on the trainer itself or on a peripheral control panel. Again,
the present scope of effort limited the sophistication which could be applied 
to instrumentation. 

Upon being recalled for each trial, the subject went through an entire set- 
up procedure, verbalizing each response which he made. Correct verbal responses 
were precoded on a trial-by-trial basis (for the randomized initial control settings)
on the experimenter's response sheet. Therefore, measurement of performance
consisted of simply checking off each response as it was emitted by the
subject. Erroneous or omitted responses were so coded. Time to complete each
run-through was measured with a stop watch. However, the watch was stopped
while subject errors were being recorded and corrected. Thus, time, errors of
omission, and errors of commission provided the dependent measures.
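
The three dependent measures can be derived from a trial record in the way sketched below. This Python fragment is illustrative only: the coding labels ("ok", "omission", "commission"), the representation of the stop watch as a list of running intervals, and the function name `score_trial` are assumptions, since the layout of the experimenter's response sheet is not reproduced here.

```python
from collections import Counter


def score_trial(step_codes, timed_intervals):
    """Summarize one run-through of the set-up procedure.

    step_codes: one code per precoded response on the response sheet:
        "ok", "omission", or "commission" (hypothetical coding labels).
    timed_intervals: (start, stop) pairs in seconds during which the
        stop watch was running, i.e., excluding pauses for error correction.
    """
    counts = Counter(step_codes)
    unknown = set(counts) - {"ok", "omission", "commission"}
    if unknown:
        raise ValueError(f"unrecognized codes: {sorted(unknown)}")
    return {
        "time": sum(stop - start for start, stop in timed_intervals),
        "omissions": counts["omission"],
        "commissions": counts["commission"],
    }


# Invented example: five responses, one omitted and one wrong; the watch
# was stopped between 40 s and 55 s while an error was corrected.
result = score_trial(
    ["ok", "omission", "ok", "commission", "ok"],
    [(0, 40), (55, 90)],
)
```

For this invented record the sketch returns a time of 75 seconds with one error of omission and one error of commission. The 15 seconds during which the watch was stopped are excluded.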

TRANSFER OF TRAINING PROCEDURE. The primary laboratory validation focused upon 
acquisition of set-up skills. However, as an adjunct to this effort, a pilot 
transfer study was also undertaken. In this effort additional training was provided
for five of the 12 groups involved in the main study (groups 2, and 9
through 12 in table 2). These particular groups were chosen because they provided
some interesting contrasts; i.e., effect of panel clutter or embedding on
transfer (ratio of used to unused displays and controls). Following the regular
acquisition trials, subjects in these groups were permitted to rest for one-half 
hour. They were then brought back to the laboratory and retrained on the medium-all
task. This training regimen was identical to the acquisition regimen; i.e.,
two complete run-throughs. However, only 10 training trials were run rather 
than 15. One of the groups originally trained on "medium all" was not given any 
retraining, but merely tested for retention. Ten trials were also employed for 
this group. 

FIELD VALIDATION OF INDICES 

The second prong of the dual validation attempt involved a study of the 
effectiveness of the 13 sonar training devices which had been previously task 
analyzed. Ideally, such a study should involve carefully controlled measurement
of actual training experiences by novice enlistees. Such a procedure, however,
would require considerable interference with on-going training activity
and normally is not feasible. Therefore, field validation was pursued via
structured interviews with experienced sonar instructors. These instructors
were asked to rate the tasks trained on their devices against a set of "synthesized"
comparison tasks.
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The data collection was undertaken at sites previously employed for 
training device analysis. These included the Fleet Sonar School at Key West, 
Florida, the Fleet Ballistic Missile Submarine Training Center at Charleston, 
South Carolina, and the Fleet Training Center and Fleet Airborne Training Unit 
at Norfolk, Virginia, and the Quonset Point Naval Air Station in Rhode Island. 

At each sonar training device installation visited, a group of four or 
five instructors was convened who were qualified on the device under examina- 
tion. These instructors had the following average experience profile: 

Experience category                 Mean number of years

Total Navy                          10.9
Sonarman at sea                      5.9
Sonar instructor                     1.9
Experience on device being rated     1.3

Instructors were assembled in groups in a classroom setting and were given 
a series of instructions. These introduced the background of the project, stated 
the purpose of the current visit, and explained the method which was to be em- 
ployed in making judgments about the particular training device under examina- 
tion. This method required the instructors to compare the set-up, detection, 
localization, and classification subtasks performed on their device against a 
similar set of subtasks associated with a hypothetical sonar trainer. This same 
set of hypothetical subtasks was used as a common frame of reference for all 
groups of instructors. The hypothetical trainer actually represented a dis- 
guised amalgam of several of the devices being studied. 

Following this general orientation, instructors were given detailed in-
structions about four specific ratio judgments which they were to make. These
instructions, included in Appendix D, concerned how estimates were to be made
about: (1) training time; (2) proficiency level; (3) degree of transfer of
training; and (4) level of task difficulty.

Upon completion of the instructions, and after any questions had been answered,
instructors were provided with flow charts designed to facilitate their judg-
ments. Two types of flow charts were used. One set described the subtasks to
be evaluated and was similar, for instance, to the set-up flow charts included
in Appendix A. The other set consisted of the standard flow charts which were
to be used as the frame of reference. These flow charts appear in Appendix B.

One subtask was dealt with at a time, starting with set-up and finishing
with classification. For a given subtask, the standardized flow chart was dis-
tributed first and reviewed step-by-step with the instructors. Next, the flow
chart representing the same subtask in the device to be evaluated was dis-
tributed and reviewed in similar fashion. Based upon a comparison of their own
subtask with the standard, instructors were then asked to provide ratio esti-
mates on each of the four criterion dimensions, using the response blank shown
in Appendix F.

When evaluations of all four subtasks were completed, a group discussion
was held to try to arrive at consensus judgments. No attempt was made to force
consensus, but instructors were encouraged to discuss any disagreements among
their ratings. Misunderstandings about evaluation procedures were also taken
up at this time. On the basis of the group discussion, each instructor provided
a final judgment. That judgment was accepted, no matter how disparate it was
from any other judgments.

Following evaluation of all of the subtasks for the actual device, instruc-
tors were finally asked to make a last series of judgments concerning the rela-
tive difficulty of the standard subtasks. This time they were to evaluate the
standard detection, localization, and classification subtasks, using the standard
set-up subtask as a basis for comparison. Such judgments were designed to pro-
vide a means for expressing the ratio estimates in terms of a common metric,
thus permitting direct comparisons across subtasks.

SECTION III

RESULTS

Three distinct sets of results are presented in this section. The first 
concerns the acquisition data obtained on the synthetic set-up trainer. The 
second set, also based on laboratory research, stems from the pilot transfer
of training study. Final portions of the results section deal with findings 
from the field validation exercise. 

In the major sections which follow, the same general format is used. The
basic layout of the data is given first, followed by a brief description of
general findings. The more specific analyses are then presented. These are
primarily in correlational form, attempting to describe the relationship between 
task index variables and a variety of criterion measures. 

LABORATORY FINDINGS

Results of the acquisition and transfer portions of the laboratory study
are presented in figures 4-11 and tables 3-5. They describe variations in per-
formance speed and accuracy as a function of synthetic trainer task configura-
tions.

ACQUISITION. The basic performance data for acquisition training are shown in
figures 4-9. In each case either mean performance time (figures 4-6) or mean
number of errors (figures 7-9) is plotted as a function of trial block with
task configuration as the parameter. The 15 acquisition trials originally ad-
ministered were collapsed into seven blocks in order to improve stability of
the data. Thus, each point in these figures represents an average value for
ten scores (five subjects per trial over two trials). An exception is the final
block (T13-15), which spans three trials and represents, therefore, 15 scores.
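
The blocking scheme just described can be illustrated with a short sketch. Only the block boundaries (six two-trial blocks followed by one three-trial block) and the five subjects per group are taken from the text; the function name, the score layout, and the sample values are hypothetical.

```python
from statistics import mean

# Block boundaries from the text: six 2-trial blocks, then one 3-trial block.
BLOCKS = [(1, 2), (3, 4), (5, 6), (7, 8), (9, 10), (11, 12), (13, 15)]


def block_means(scores):
    """Collapse per-trial scores into block means.

    `scores` maps trial number (1-15) to a list of per-subject scores
    (hypothetical layout; the study used five subjects per group).
    Returns one (label, number_of_scores, mean) tuple per block.
    """
    out = []
    for first, last in BLOCKS:
        pooled = [s for t in range(first, last + 1) for s in scores[t]]
        out.append((f"T{first}-{last}", len(pooled), mean(pooled)))
    return out


# Made-up times (seconds) for one group: five subjects on each of 15 trials.
demo = {t: [100.0 - 3.0 * t + subj for subj in range(5)] for t in range(1, 16)}
for label, n, m in block_means(demo):
    print(label, n, round(m, 1))
```

Each of the first six blocks pools ten scores and the last pools 15, which is why the final point in each figure rests on a somewhat larger sample.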

Figures 4-6 and 7-9 have essentially been broken out from two larger time
and error composites in order to improve clarity of presentation. The simple-
third and simple-none configurations provide one grouping (figures 4 and 7).
The simple-all configurations provide a second grouping (figures 5 and 8), and
the medium and complex configurations yield a third grouping (figures 6 and 9).
These pairs of figures describe mean performance time and mean number of errors
respectively.

Viewed in their entirety, all six figures reveal substantial variance in
performance across task configurations. This variance is shown most clearly
for the mean performance times of the simple-none, simple-all, and simple-
third groups (figures 4 and 5). The medium groups, while contributing to over-
all variance, are fairly homogeneous, especially when compared to the complex-
all configuration (figure 6). Variation across tasks in terms of error scores,
though somewhat less dramatic, is still marked (figures 7-9). This is again 
particularly true for the simple-third + 6 and simple-none tasks (figure 7). 
Demonstrable variance in both the time and error criterion measures was, of 
course, a prerequisite for the anticipated correlational analyses.

Figure 4. Mean performance time as a function of trial block during
acquisition training for simple-third and simple-none tasks

Figure 5. Mean performance time as a function of trial block during
acquisition training for simple-all tasks

Figure 6. Mean performance time as a function of trial block during
acquisition training for medium and complex tasks

Figure 7. Mean errors as a function of trial block during acquisition
training for simple-third and simple-none tasks

Figure 8. Mean errors as a function of trial block during acquisition
training for simple-all tasks

Figure 9. Mean errors as a function of trial block during acquisition
training for medium and complex tasks

Closer inspection of both sets of data shows that learning occurred on
all tasks. The training regimen brought about a consistent reduction in the
time required to perform each task as well as in the number of errors made.
In the case of the "simpler" tasks, time and error scores appear to be reach-
ing asymptotic levels (figures 4 and 5, 7 and 8). On the medium and complex
tasks, however, continued improvement is still noticeable (figures 6 and 9).

It is of interest that in both the time and error data there are two
apparent sources for the observed differences among the various plots. The
first is related to type of task, while the second involves task embedding. In
this connection task refers to a specific set of procedural responses performed
in a prescribed sequence. Embedding refers to the degree to which all of the
displays and controls available for use are indeed used during task performance.

Variation in performance time due to type of task is clearly seen when the
simple-none, simple-all, and simple-third + 6 plots are compared (figures 4 and
5). The consistent ordering in performance time throughout acquisition holds
up for all task types with the single exception of the medium-none + 2 task
(figure 6). With respect to error scores, the clearest consistent difference
is seen between the simple-none and simple-third + 6 tasks (figure 7).

Particularly noteworthy are the different levels of performance associated
with task embedding. For example, time (figure 4) and errors (figure 7) are
both greater for the embedded versions of the simple-third + 6 and simple-none
tasks. For simple-all tasks, this relationship holds only with respect to the
time measures, which increase as a function of degree of embedding (figure 5).
With only two reversals, the differences in performance associated with task
embedding are maintained throughout acquisition. The amount of training pro-
vided, although reducing the initial spread among these groups, is insufficient
to eliminate the effects of extraneous displays and controls. This finding is
made all the more interesting by the fact that performance for these simple
task groups appears to be reaching an asymptote (figures 4, 5, and 8). The re-
lationship is not as clear in the case of the medium-third task, which behaves
as the simple embedded tasks do with respect to error (figure 9), but shows the
opposite relationship for time (figure 6).

In much of the criterion data just described, relationships are strongly
implied between performance during acquisition and the type of task to which
subjects are exposed. The fairly consistent ordering of tasks with respect to
performance level directly raises an issue of basic concern in the present
research. To what extent are the indices, descriptive of the various trainer
configurations, related to criterion performance? The Pearson product-moment
correlation coefficients shown in tables 3 and 4 bear on this issue.
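
As an illustration of how such coefficients are computed and tested, the sketch below calculates a Pearson product-moment correlation and derives the smallest correlation that reaches significance with 10 degrees of freedom (12 task configurations minus 2). The critical t values are the standard two-tailed tabled values; the index values and mean times are hypothetical and are not data from the study.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def critical_r(t_crit, df):
    """Smallest |r| reaching significance, given the critical t for df."""
    return t_crit / sqrt(t_crit ** 2 + df)


# Two-tailed critical t values for 10 degrees of freedom (standard tables).
T_05, T_01 = 2.228, 3.169
DF = 10  # 12 task configurations minus 2

r_05 = critical_r(T_05, DF)
r_01 = critical_r(T_01, DF)
print(round(r_05, 3), round(r_01, 3))

# Hypothetical index values and mean times for 12 configurations.
index_values = [4, 6, 7, 9, 10, 12, 13, 15, 16, 18, 20, 22]
mean_times = [31, 35, 33, 40, 44, 43, 50, 49, 57, 55, 62, 66]
r = pearson_r(index_values, mean_times)
print(round(r, 2), abs(r) >= r_05)
```

The two critical values produced this way agree with the thresholds of .576 and .708 quoted beneath tables 3 and 4.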

As shown in table 3, correlations of task indices with mean performance
time at each trial block are, in general, highly consistent. With the excep-
tion of three variables (D%, C%, and E%), all reported coefficients are sig-
nificant (p < .05). The three exceptions are in themselves interesting be-
cause of the consistently small correlations which they exhibit across all
seven trial blocks. The same general pattern of relationships is also found
in the mean error data reported in table 4. D%, C%, and E% fail to correlate
substantially with mean error at any of the trial blocks. All other indices
do exhibit substantial correlations with the error criterion. With the excep-
tion of the AA% and DEI indices, however, the correlations with error are neither
as strong nor as consistent as they were with the performance time criterion.

TABLE 3. INTERCORRELATIONS OF TASK INDEX VALUES AND MEAN
PERFORMANCE TIMES ACROSS TRIAL BLOCKS FOR THE LABORATORY TASKS†

Task                      Trial Blocks
Indices      1     2     3     4     5     6     7

MAIN        73    81    83    86    88    94    88
CNTG        78    82    84    86    90    91    86
TA          77    82    85    87    91    94    88
CONT        66    72    80    83    87    90    83
DISP        65    71    68    72    76    83    74
N           69    75    77    81    84    90    82
LV          74    80    81    84    88    92    85
AA%        -75   -75   -85   -80   -83   -7?   -7?
F%         -65   -62   -75   -76   -83   -72   -71
DEI        -83   -79   -86   -85   -89   -77   -79
D%         -06   -01    06    08    13    17    18
C%         -04   -01    08    10    15    17    17
E%         -12   -06    03    05    10    14    15
CRPS        73    77    87    88    92    90    87
FBR         70    76    69    73    75    82    76
INFO        72    79    79    83    85    92    84
INST        71    81    80    81    79    90    86

? = digit not legible.
†Decimal points have been omitted from coefficients for clarity.
With 10 degrees of freedom: r ≥ .708, p ≤ .01
                            r ≥ .576, p ≤ .05

TABLE 4. INTERCORRELATIONS OF TASK INDEX VALUES AND MEAN
ERRORS ACROSS TRIAL BLOCKS FOR THE LABORATORY TASKS†

Task                      Trial Blocks
Indices      1     2     3     4     5     6     7

MAIN        59    28    41    57    48    73    18
CNTG        65    46    58    69    66    86    36
TA          63    39    51    64    59    81    28
CONT        46     ?    4?    61    53    69    17
DISP        58     ?     ?    46    43    78    14
N           55     ?    ?1    55    50    78    16
LV          61     ?    4?    59    54    80    22
AA%          ?     ?   -83   -89   -88   -73   -73
F%         -49   -41   -75   -76   -76   -66   -43
DEI        -67   -72   -93   -88   -88   -77     ?
D%         -07   -24   -19   -10   -04    07   -07
C%         -04   -20   -12   -01    05    09   -06
E%         -13   -30   -23   -11   -06    05   -11
CRPS        54    33    62    73    67    72    34
FBR         65    42    35    47    44    80    22
INFO        61     ?    40    55    49    79    19
INST        60    27    28    45    33    57    16

? = entry or digit not legible.
†Decimal points have been omitted from coefficients for clarity.
With 10 degrees of freedom: r ≥ .708, p ≤ .01
                            r ≥ .576, p ≤ .05

Of particular concern in both tables 3 and 4 are the generally large
coefficients associated with the TA index. TA, representing the total actions
or total number of responses comprising a task, correlates positively and highly
significantly (p < .01) with all time scores. Although the coefficients are
generally smaller, TA also exhibits a strong relationship with error scores
(table 4). By themselves, these relationships are of trivial interest. They
simply reflect the fact that the longer a task is, the more time will be re-
quired for its performance and the more potential errors there will be. What
is disturbing, however, is that the relationships between the other indices and
the performance criteria may arise because of dependencies between the remaining
indices and TA.

During construction of the various trainers, concern arose over this very 
point. As previously mentioned, it was extremely difficult to manipulate many 
of the indices completely independently of TA. Examination of the task index
intercorrelation matrix (not shown) confirms this impression. TA correlates 
significantly with all other task indices (p < .01), with the exception of D%, 
C%, and E%. With respect to the basic criterion data, therefore, it is unclear 
to what extent the other indices themselves relate to the criteria or simply 
mirror TA's relationships. 

In an attempt to minimize potential contamination due to TA's influence,
acquisition time and error scores were transformed prior to further analysis.
The data selected for treatment were from the first, fourth, and seventh trial
blocks, these points being chosen to represent performance at early, intermedi-
ate, and later stages of acquisition. Time and error data sets for each of the
three trial blocks were treated separately. For each data set, single variable
regression analyses were conducted using TA as the independent or predictor
variable. This procedure resulted in sets of residual criterion scores from
which all variance related to TA had been removed. The residual scores were
simply the difference between the observed raw score values and the values pre-
dicted by the TA variable.
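
The adjustment can be sketched as follows. This is a minimal illustration of single-variable least-squares residualization, not the program used in the study; the function name and the TA and time values are hypothetical.

```python
def residualize(criterion, ta):
    """Return criterion scores with the linear effect of TA removed.

    Fits criterion = a + b * TA by ordinary least squares and returns
    the observed values minus the predicted values.
    """
    n = len(ta)
    mean_ta = sum(ta) / n
    mean_c = sum(criterion) / n
    sxy = sum((x - mean_ta) * (y - mean_c) for x, y in zip(ta, criterion))
    sxx = sum((x - mean_ta) ** 2 for x in ta)
    slope = sxy / sxx
    intercept = mean_c - slope * mean_ta
    return [y - (intercept + slope * x) for x, y in zip(ta, criterion)]


# Hypothetical TA values and block-mean times for 12 task configurations.
ta = [8, 8, 8, 12, 12, 12, 16, 16, 20, 20, 24, 24]
times = [30.0, 34.0, 38.0, 41.0, 47.0, 44.0, 52.0, 58.0, 63.0, 60.0, 71.0, 75.0]

residuals = residualize(times, ta)

# The residuals are uncorrelated with TA: their cross-product with the
# centered TA values vanishes up to floating-point error.
mean_ta = sum(ta) / len(ta)
cross = sum((x - mean_ta) * e for x, e in zip(ta, residuals))
print(abs(cross) < 1e-9, abs(sum(residuals)) < 1e-9)
```

Because the residuals are orthogonal to TA by construction, any correlation that another index shows with them cannot be attributed to that index's linear dependence on TA.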

Evidence that the residualizing procedure had its intended effect comes 
from two sources. First, correlations between TA and the residual scores are 
zero. Second, correlations between the other (16) task indices and the resid- 
ual criteria are greatly reduced. The only significant correlation is between 
E% and performance time at the first block (r = -.58, p < .05). Relationships
among the predictor task index variables are, of course, undisturbed by the 
adjustment procedure. TA is no longer included in this set and appears in none 
of the regression analyses described below. 

Six separate regression analyses were performed, one for each of the three
time and three error criterion data sets. A step-wise (step-up) regression
procedure was employed with a maximum of four predictor variables being fitted.
Standard values were employed for the F-level criteria for predictor variable
inclusion or deletion. The results of the six analyses are summarized in table
5. For each analysis, denoted by type of criterion, the multiple correlation
coefficient (R) is reported together with the proportion of variance in the
criterion accounted for (R²). Also provided are the degrees of freedom (df)
used in testing the significance of R and the resultant F-value. Finally, the
specific indices included in each regression solution are listed. They appear
from left to right in the order in which they were entered by the step-wise
procedure. Only three indices are shown even though in all cases four were
fitted. The small sample size (N = 12) suggested a conservative approach to
description of the predictor indices.

TABLE 5. MULTIPLE REGRESSION ANALYSES OF PERFORMANCE
TIME AND NUMBER OF ERRORS FOR FIRST, MIDDLE, AND LAST
BLOCK OF ACQUISITION TRIALS

                                               Indices in order of
                                               selection by step-wise
Criterion       R      R²     df(a)    F       regression program

Time Scores
  T1-2         .780   .608   3, 8     4.69*    E%, AA%, D%
  T7-8         .744   .553   3, 8     3.30     E%, AA%, DISP
  T13-15       .626   .392   3, 8     1.72     AA%, C%, DISP

Error Scores
  T1-2         .651   .423   3, 8     1.96     E%, C%, D%
  T7-8         .896   .802   3, 8    10.80**   AA%, MAIN, D%
  T13-15       .875   .766   3, 8     8.73**   AA%, CONT, DEI

*p < .05.
**p < .01.
(a) Sample size (N) = df1 + df2 + 1.
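
A step-up selection of the general kind described can be sketched as below. This is an illustrative simplification and not the regression program used in the study: predictors are entered purely by the largest gain in explained variance, and no F-to-enter or F-to-remove test is applied. The function names and the demonstration data (X1, X2, X3) are hypothetical.

```python
def _center(v):
    m = sum(v) / len(v)
    return [x - m for x in v]


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _sweep(v, z):
    """Remove from v its least-squares projection on z."""
    b = _dot(v, z) / _dot(z, z)
    return [x - b * y for x, y in zip(v, z)]


def forward_select(y, predictors, max_terms=4, tol=1e-12):
    """Step-up selection: enter the predictor giving the largest R^2 gain.

    `predictors` maps an index name to its list of values. Returns the
    names in order of entry together with the final R^2.
    """
    resid = _center(y)
    total_ss = _dot(resid, resid)
    pool = {name: _center(x) for name, x in predictors.items()}
    chosen = []
    while pool and len(chosen) < max_terms:
        # Sum of squares each remaining predictor would explain next.
        gains = {
            name: (_dot(resid, z) ** 2 / _dot(z, z)) if _dot(z, z) > tol else 0.0
            for name, z in pool.items()
        }
        best = max(gains, key=gains.get)
        if gains[best] <= tol:
            break
        z = pool.pop(best)
        resid = _sweep(resid, z)
        # Orthogonalize the remaining predictors against the one just entered.
        pool = {name: _sweep(v, z) for name, v in pool.items()}
        chosen.append(best)
    return chosen, 1.0 - _dot(resid, resid) / total_ss


# Hypothetical demonstration: y depends on X1 and X2 only.
x = {
    "X1": [1, 2, 3, 4, 5, 6],
    "X2": [1, -1, 1, -1, 1, -1],
    "X3": [1, 0, 0, 0, 0, 1],
}
y = [3 * a + b for a, b in zip(x["X1"], x["X2"])]
order, r2 = forward_select(y, x, max_terms=2)
print(order, round(r2, 3))
```

With only 12 observations, the order of entry in a procedure of this kind can be sensitive to small changes in the data, which is consistent with the conservative reporting of three indices per solution.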

As shown in table 5, when the effects upon performance time due to number
of responses (TA) are removed, a significant multiple correlation between
task indices and time is found only during the very early stages of acquisition
(R = .780, p < .05). The relationship is between mean performance time and E%,
AA%, and D%. The first and last of these indices reflect the extent to which
superfluous equipment elements, especially displays, are encountered during task
performance. One interpretation is that extraneous equipment has a distracting
value which initially retards performance time, but whose impact decreases as
the trainee masters the figure-ground (task-configuration) distinction. In
line with this hypothesis, only E% is entered into the solution at T7-8, while
neither E% nor D% is entered at T13-15. Also consistent with this same idea,
the zero-order correlations of E% and D% with residual time scores are nega-
tive and decrease over trial blocks. [For E%, r = -.58, -.49, and -.50; for
D%, r = -.52, -.45, and -.29.]

As shown in table 5, a complementary situation exists with respect to
relationships between task indices and error scores. That is, no relationship
exists early during acquisition, but strong relationships emerge toward the
end of training. By the middle of training, AA%, MAIN, and D% are signifi-
cantly correlated with the mean number of errors being made (R = .896, p < .01).
AA%, MAIN, and D% individually, however, have non-significant zero-order corre-
lations with residual error scores at this time point (i.e., r = -.57, -.08,
and -.42). During the final block of trials the relation between indices and error
scores is still significant (R = .875, p < .01). The mixture of related indices
has changed, however. MAIN and D% have been replaced by CONT and DEI, while
AA% is still present, as it is in five of the six analyses. The zero-order
correlations of AA%, CONT, and DEI with residual errors are r = -.55, -.10, and
-.50 respectively.
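
The F-values in table 5 can be checked against the usual relation between R² and F for a multiple correlation, F = (R²/df1) / ((1 - R²)/df2). The sketch below applies that relation to four of the tabled R² values; it assumes, without confirmation from the source, that the original program used this standard formula, and the function name is hypothetical.

```python
def f_from_r2(r2, df1, df2):
    """F ratio for testing a multiple correlation.

    df1 is the number of predictors; df2 = N - df1 - 1.
    """
    return (r2 / df1) / ((1.0 - r2) / df2)


# (criterion, R^2, df1, df2) as reported in table 5.
rows = [
    ("time, T7-8", 0.553, 3, 8),
    ("time, T13-15", 0.392, 3, 8),
    ("errors, T7-8", 0.802, 3, 8),
    ("errors, T13-15", 0.766, 3, 8),
]
for label, r2, df1, df2 in rows:
    print(label, round(f_from_r2(r2, df1, df2), 2))
```

For these four rows the computed ratios agree with the tabled F-values to two decimal places. Small differences can arise elsewhere because the tabled R² values are themselves rounded.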

More generally, both sets of data show that task indices of the type em- 
ployed in the present study can be related to learning or performance criteria. 
The strength of the obtained relationships suggests that it may be possible to 
use task index information to predict training criterion levels. 

TRANSFER. The basic criterion data for the pilot transfer study are shown in 
figures 10 and 11. In each case either mean performance time (figure 10) or 
mean number of errors (figure 11) is plotted as a function of trial block with 
task configuration used during acquisition as the parameter. The ten transfer
trials actually administered have been collapsed into five blocks. Therefore, 
each point in these figures represents an average value for ten scores. 

In both figures the results are expressed in terms of performance on the
medium-all task. In each case six different plots are shown. Two of these
are used as frames of reference. The first portrays performance of the medium-
all group during the first portion (trials 1 to 10) of the acquisition session.
The second plot shows the performance of this same group during the later,
transfer session. All groups rested for one-half hour between acquisition and
transfer sessions. The remaining four plots portray performance on the medium-
all task during the transfer session, after practice was given on interpolated
tasks during acquisition.



NAVTRADEVCEN 71-C-0059-1

Figure 10. Mean performance time as a function of trial block
during transfer to medium-all task (horizontal axis: blocks of trials,
T1-2 through T9-10)

Figure 11. Mean errors as a function of trial block during transfer
to medium-all task

In figure 10, the medium-all subjects provide an extremely clean base-
line in performance time against which the other functions may be viewed.
Performance time for this group is apparently at asymptote and clearly rep-
resents an improvement over the times achieved during acquisition. The inter-
polated task groups show a slight reduction in performance time during transfer,
but across all blocks are slower than the medium-all (transfer) group (p < .05).
Even more interesting, perhaps, is the fact that the interpolated groups are
significantly faster than the medium-all acquisition group only at the first
two blocks (p < .05). Thereafter the interpolated task and medium-all acquisi-
tion data are indistinguishable. This is in spite of the fact that the inter-
polated groups have, by the third block, had 3.5 times as much practice on set-
up consoles.

The breakout due to embedding which occurs during acquisition is not ob-
tained in the transfer time data. Furthermore, there is only the barest hint
of a difference in performance time during transfer due to interpolated task
type.

The error data shown in figure 11 show a slightly different set of re- 
lationships. The baseline mean number of errors for the medium-all group is 
somewhat variable, though approaching what appears to be an asymptote. Again,
there clearly are lower numbers of errors made by this group during transfer 
than during acquisition. As in figure 10, there is no suggestion of an effect 
on errors made due to task embedding. 

Particularly noteworthy, however, is the evidence for a task-type effect
upon error scores which was not so clearly seen in the time data. The simple-
none tasks have significantly fewer mean errors than the medium-all acquisition
group only at the first block (p < .05). Significantly fewer mean errors are
associated with simple-all tasks, relative to the medium-all acquisition group,
on all but the last block of trials (p < .05). Conversely, the simple-all
groups have significantly fewer errors than the simple-none task across the
first three blocks of trials (p < .05).

Considered jointly, the pilot data presented in both figures suggest that 
the simple-all subjects can perform well during transfer with respect to accu- 
racy but that they pay a price in terms of speed. On the other hand, groups 
which were trained on more dissimilar trainers (simple-none groups) pay a 
price in terms of both speed and accuracy. 

FIELD FINDINGS 

The basic ratio estimation data obtained during the field study are shown
in Appendix G. In each of four tables, representing the set-up, detection,
localization, and classification subtasks, four criterion estimates are shown
across training devices. Each datum represents the mean of instructors' con-
sensus magnitude estimates relative to the values assigned to the standards
for comparison. These standard values were arbitrarily set at 100, 50, 50,
and 100 for the four types of criteria.

In any of the tables comprising Appendix G the first striking feature of
the data is the difference in values across columns. This is, of course, pri-
marily due to the use of different standards of comparison (i.e., 100, 50, 50,
100). The estimation data within any column, however, do show appreciable
variability. On the set-up task (table G-1), for example, the first and
fourth scales have ranges of 50-390 and 50-260, respectively. Although not
as extreme, the second and third scales also show good variance. Finally, on
all scales, mean estimates are obtained which lie both above and below the re-
spective standard values. These aspects of the data suggest that the ratio
estimation procedure which was employed apparently succeeded in spreading out
estimates across devices. As in the laboratory, reasonable variance in the
criteria was a necessary condition for achieving any predictability.

Two additional types of variation are of interest in these data. First,
consider the amount of variation, within any subtask and on any specific scale,
for similar devices found at different locations. In many cases agreement is
extremely good. In others it is not. On the training time scale for the set-
up task (table G-1), for example, a fairly large difference between 0A1283
stacks exists. The BQR-2B and 2C stacks, however, lead to amazingly similar
judgments. A more thorough examination of these issues is underway, the de-
tails of which are beyond the present level of analysis.

Another interesting variation is seen when one focuses on a specific de-
vice and scale, and then looks across subtasks. But before subtasks can be
compared, any differences between the standard task examples have to be re-
moved. Toward this end, instructors in the present study scaled the detection,
localization, and classification standards relative to the set-up standards.
Based upon these data, averaged across all instructors, a set of weights was
derived for each subtask. The weights for the first two criteria are shown at
the bottom of the tables in Appendix G for each subtask. Using these weights,
for example, one would conclude that classification training time on the 14E3
is almost seven times longer (212 x 1.81 = 384) than localization training
(49 x 1.13 = 55). Since comparisons of this type were of interest in the
present study, weighted consensus scores were used in all subsequent analyses.
Use of these transformed estimates also made a number of combinatory analyses
possible.
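
The arithmetic of the weighting example can be reproduced directly. The figures below are those quoted in the text for the 14E3; the function name is hypothetical.

```python
def weighted_estimate(raw_estimate, subtask_weight):
    """Express a consensus ratio estimate in units of the set-up standard."""
    return raw_estimate * subtask_weight


# Training time estimates and subtask weights quoted in the text for the 14E3.
classification = weighted_estimate(212, 1.81)
localization = weighted_estimate(49, 1.13)

print(round(classification), round(localization))
print(round(classification / localization, 1))
```

The weighted values round to 384 and 55, and their ratio is a little under seven, which is the comparison drawn in the text.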

In Appendix H, zero-order, product-moment correlation coefficients are
shown in separate tables for each of the four subtasks. The coefficients de-
scribe the relation between task indices and criterion estimates. Two features
of the data are of interest. First, significant relationships between individ-
ual criteria and indices are obtained and cut across all four subtasks. Second,
for the most part, when a task index exhibits a significant correlation with
one criterion, its correlations with the remaining criteria also tend to be
strong if not always significant. The redundancy among criteria implied by
this observation is confirmed when the intercorrelations among criteria are
examined. In all four subtasks, the correlations between estimated training
time and task difficulty range between r = .96 and r = .84. Those for profi-
ciency level and transfer lie between r = .92 and r = .96. The correlations
between training time and proficiency level estimates, while still significant,
tend to be somewhat lower (i.e., r = -.67 to r = -.89). Because of this smaller
redundancy, and because these two estimates were in a sense analogous to crite-
ria employed in the laboratory, they alone were chosen for analysis. In the
following analyses (C1) denotes the training time estimate, and (C2) stands for
the proficiency level judgment.

Finally in Appendix H, significant correlations are shown between the
TA variable and the two criteria selected for analysis. TA represents the
number of actions or responses comprising a task. In the flow charts examined
by the instructors it was possible to convert TA rather directly and perhaps
superficially into a concept of task length or difficulty. To reduce the im-
pact of flow-chart length upon instructor estimates and to use data analogous
to those analyzed in the laboratory, the regression adjustment procedure was
used again. The C1 and C2 data were transformed into residual scores for
analysis, thereby reducing that portion of criterion variance associated with
TA. Resultant correlations between the remaining 16 task indices and the re-
sidual criterion scores were greatly reduced.

Results of the seven distinct regression analyses performed on the train-
ing time (C1) and proficiency level (C2) residual data are summarized in table
6. The column headings are the same as those previously used in reporting the
laboratory data (table 5). Four of the seven analyses are at the basic sub-
task level. The remaining three are combinatory and examine different poolings
of the subtasks. Set-up and localization are pooled because they seem to rep-
resent cases in which the trainee interacts most directly with his stack, par-
ticularly in making control settings and adjustments. The detection and classi-
fication tasks are pooled because of their perceptual, signal processing flavor.
At the highest level of analysis, all four subtasks are examined simultaneously.

In table 6 significant relationships are shown between selected task indices
and the instructor ratio estimate criteria. These relationships are obtained in
spite of the highly conservative procedure of using residual scores, a
procedure which greatly reduced the zero-order correlations between predictors
and criteria. Significant relationships are established in all but two of the
analyses. The multiple correlations associated with the classification and
set-up tasks are not significant by conventional standards (p < .05). However,
the fact that more than half of the variance is accounted for in the set-up
(C1) analysis cannot be ignored (p < .10).
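
The F ratios reported with each multiple correlation follow from R-squared and
the degrees of freedom. The sketch below assumes the standard overall F test
for a multiple regression; the function name is illustrative and not taken
from the report.

```python
def f_ratio(r_squared, df_model, df_error):
    """Overall F test for a multiple correlation.

    F = (R^2 / df_model) / ((1 - R^2) / df_error)
    """
    return (r_squared / df_model) / ((1.0 - r_squared) / df_error)


# All-tasks analysis, criterion C1: R^2 = .356 with 7 and 37 df.
print(round(f_ratio(0.356, 7, 37), 2))

# Set-up analysis, criterion C1: R^2 = .550 with 3 and 9 df.
print(round(f_ratio(0.550, 3, 9), 2))
```

With the rounded R-squared values from table 6, the function returns the
tabled F values to within rounding error.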

One of the most interesting features of the data shown in table 6 is that 
the patterns of indices which contribute to significance change from subtask 
to subtask and from individual subtasks to pooled subtasks. The DEI index, 
for instance, while related to both criteria in the overall analysis, does not 
fall out in the intermediate poolings. It does appear, however, at the single 
task level. Similarly, AA%, which is one of the primary indices at the inter- 
mediate level, disappears from the overall analyses. These shifting patterns 
imply that different index factors may be required, depending upon the subtask
under examination. 
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TABLE 6. SUMMARY OF MULTIPLE REGRESSION ANALYSES OF INSTRUCTORS' 
RATIO ESTIMATES: INDIVIDUAL SUBTASKS AND POOLED SUBTASKS 





                                                                     Indices in order of selection by the
Subtask(s)                  Criterion    R      R²      df      F        step-wise regression program

All Tasks                      C1      .597   .356   7, 37    2.92*    D%, INFO, CNTG, F%, DEI, E%, CONT
                               C2      .658   .433   7, 37    4.04**   INFO, MAIN, LV, D%, E%, CONT, DEI

Set-up + Localization          C1      .628   .395   4, 20    3.25*    AA%, D%, INFO, DISP
                               C2      .644   .415   4, 20    3.54*    DISP, INFO, MAIN, C%

Detection + Classification     C1      .765   .584   4, 15    5.28**   F%, AA%, CNTG, E
                               C2      .358   .128   4, 15    0.55     CONT, FBR, CNTG, MAIN

Set-up                         C1      .741   .550   3, 9     3.66     E, DEI, LV
                               C2      .615   .379   3, 9     1.83     D%, E, DEI

Detection                      C1      .892   .796   2, 7    13.64**   INST, CONT
                               C2      .811   .658   2, 7     6.73*    G, DEI

Localization                   C1      .848   .719   3, 8     6.84*    D%, CRPS, DEI
                               C2      .629   .395   3, 8     1.74     DISP, MAIN, FBR

Classification                 C1      .569   .324   2, 7     1.67     F%, D%
                               C2      .448   .201   2, 7     0.88     F%, DEI

C1 = Training time needed to achieve instructor proficiency.
C2 = Proficiency level after 2 hours of practice on the device.
*p < .05
**p < .01
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SECTION IV 



DISCUSSION 

In this section the results which have been detailed in Section III are
summarized separately for the laboratory and the field. The significance of
these results for task quantification and performance prediction is then
discussed. Finally, major conclusions and implications for future research are
drawn.

PREDICTION OF SET-UP TASK SKILL ACQUISITION 

The results of the laboratory acquisition study generally showed wide variation
in performance as a function of task/trainer configuration, variations which
were at least intuitively systematic. Furthermore, the systematic spreads in
performance, established early in training, were generally maintained
throughout acquisition. This is particularly significant because performance
tended to reach stable, asymptotic levels toward the end of acquisition.
Finally, regression analysis demonstrated a substantial amount of significant
correlation between the task indices and performance.

The predictability which was obtained is all the more significant because the
prepotent effects of total actions (TA) were statistically eliminated. This
predictability was also obtained in spite of a number of sources of error
variation which were not dealt with to our complete satisfaction. These
included variations due to subjects, variations due to the use of two
experimenters, and restrictions in the ranges of some of the index values. For
example, DEI for the field devices ranged from 10 to 500 x 10^-4. In the
laboratory we obtained a range of 5 to 21 x 10^-4. This restriction may have
accounted in part for the somewhat different patterns of predictors which
emerged from the step-wise regressions for laboratory and field. More
comparable ranges of index values might have increased the correspondence among
the predictors.

The predictability obtained gains further significance because of its 
presence (in some sense) throughout acquisition; i.e., ability to predict per- 
formance from task indices was more than a Block 1 phenomenon. Moreover, there 
was some, though not perfect, consistency in the patterns of predictors which
emerged over time; E%, AA%, and D%, for example, were selected by the step- 
wise program at more than one block. 

But, while predictability was possible throughout acquisition, the relationship
between type of predictability and phase of training was not a simple one. A
significant multiple R was obtained early in training using the time criterion,
but later in acquisition, significance was obtained with the error criterion.
A possible explanation for this pattern of modes of predictability is that all
the devices were equally error prone on Block 1, but that differential
elimination of errors occurred by Block 7. Differential elimination of time
effects is also possible, of course, but appears less likely. It was apparent
to the experimenters during data collection that on more complex devices,
subjects tended to rush through long sequences of calibration-type responses
with attendant carelessness in setting controls or reading displays.

The results of the acquisition study have a number of implications. First, they
support the feasibility of differentiating set-up performance on sonar-type
stacks by manipulating panel design. Such differentiation is critical if any
predictability from task indices is going to be possible. They suggest further
that it is, in fact, possible to relate such performance explicitly to design
parameters, even when those parameters are purged of effects of variables which
are prepotent, but of trivial interest.

The implication of removing TA, eliminating most of the zero-order correlation,
and still obtaining significant multiple correlations is that the multivariate
approach is essential; i.e., individual task indices did not appear capable of
predicting performance on our training devices. Rather, collections of indices,
with perhaps specific but as yet unidentified patterns of features, are
crucial. Moreover, there is some hint in the results that these patterns may
depend upon training stage, though some indices did appear to occur rather
often.
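
How a step-wise program arrives at a collection of indices can be sketched as
a greedy forward selection. This is an illustration under stated assumptions,
not the program used in the study: the entry rule here is simply "largest gain
in R-squared," no significance test for entry or removal is applied, and the
data and index names A, B, and C are hypothetical.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def r_squared(columns, y):
    """R^2 of a least-squares fit of y on the predictor columns (with intercept)."""
    n = len(y)
    design = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(design[0])
    xtx = [[sum(row[a] * row[b] for row in design) for b in range(k)]
           for a in range(k)]
    xty = [sum(row[a] * yi for row, yi in zip(design, y)) for a in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(bj * xj for bj, xj in zip(beta, row)) for row in design]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def forward_select(predictors, y, max_terms):
    """At each step, add the index that yields the largest R^2."""
    chosen = []
    while len(chosen) < max_terms:
        remaining = [name for name in predictors if name not in chosen]
        if not remaining:
            break
        best = max(remaining, key=lambda name: r_squared(
            [predictors[c] for c in chosen + [name]], y))
        chosen.append(best)
    return chosen


# Hypothetical data: the criterion depends on A and B jointly, not on C.
predictors = {
    "A": [1, 2, 3, 4, 5, 6, 7, 8],
    "B": [1, -1, 1, -1, 1, -1, 1, -1],
    "C": [3, 1, 4, 1, 5, 9, 2, 6],
}
y = [10 * a + b for a, b in zip(predictors["A"], predictors["B"])]
print(forward_select(predictors, y, max_terms=2))
```

In this toy case B alone accounts for very little criterion variance, yet it is
selected second because it accounts for what remains once A has entered. This
is the sense in which a collection of indices can predict where single indices
do not.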

In addition to implying that predictor patterns may vary with stage of
training, the results also imply that criterion patterns may be similarly
influenced. Thus, the designer may have to ask not whether indices relate to
training effectiveness, but what patterns of indices relate to what criterion
of effectiveness at what stage of training. This is a question which the
present research cannot answer.

An interesting sidelight is provided by the ordering effect due to embed- 
dedness. This effect implies that there may be value in using overlays to 
train the set-up task, much in the same way that overlays are used to teach 
anatomy or to facilitate the performance of an assembly-line electronics
inspector. Through such a device, small sets of related details can be presented,
while other immediately unrelated details are held back temporarily. Given 
that embeddedness does, in fact, substantially retard training of set-up tasks, 
it would be of interest to determine whether the use of successive overlays can 
improve this training. 

PREDICTION OF SET-UP TASK TRANSFER OF TRAINING 

The transfer study was a pilot effort to relate some specific design vari- 
ations to transfer of training from a "simple" device to a more complex device. 
This might have a very approximate parallelism to training on a synthetic de- 
vice in the school setting and then going to a specific stack in the field. 

The particular configurations which were used reflected increasing values 
of embeddedness, as reflected in DEI, and increasing numbers of total actions 
needed to complete the task, as reflected in TA and DEI. Here TA was not a 
trivial variable because, unlike the case for acquisition, TA (on the
acquisition task) did not directly affect performance on the task being
measured (the transfer task).

The results were encouraging because there was an intuitively systematic 
ordering of the configurations on Block Ti^2 of the transfer session. While 
true for both speed and accuracy of performance, it was particularly striking
in the latter case. Error was proportional to the distance (along a similarity
dimension) between interpolated and transfer tasks. This ordering was supported
by correlational analysis of DEI and TA which showed significance at p < .05.
These results suggest that it may very well be feasible to predict transfer
effects from quantitative indices. Emphasis, of course, has to be placed on the
word "suggest," since we intended here only to obtain some pilot information.
The small number of cases can, at best, only provide encouragement for pursuing
this line of investigation in a more rigorous fashion.

With respect to the time criterion, results showed that the interpolated groups
(i.e., the "simple" groups) never caught up with the group originally trained
on the medium-all device. They lagged in performance speed during the entire
transfer session. This suggests that operators trained on synthetic devices and
then transferred to field stacks might pay a price in speed, which is not
readily mitigated, though conceivably they could attain a satisfactory level of
accuracy. Herein lies another very interesting and pragmatic line of
investigation.

PREDICTION OF JUDGED TRAINING EFFECTIVENESS 

The extent of correspondence between results obtained in the laboratory 
and results obtained in the field was substantial and very encouraging. This 
is particularly so, given the expected softness of the field estimation data. 

First, the success obtained in the laboratory in generating performance 
variation across devices was continued in the field. Mean estimates of trainer 
"effectiveness" showed wide dispersion across the 13 field stacks which were 
studied. In particular, the set-up and localization tasks generated wide var- 
iance. While performance variability was not as great for detection and class- 
ification, it was nonetheless substantial. 

When the ratio estimate data were scaled for effects of standard task dif- 
ficulty, it was seen that there was considerable variation across subtasks, 
within devices as well as across devices. This finding, coupled with results 
of the regression analysis, supports the contention that device effectiveness 
may depend very heavily on the manner in which the device is used. That is, 
it may not be feasible to talk about the effectiveness of a training device in 
generic (i.e., figure of merit) terms, but only in terms of the use to which 
the device is being put (i.e., in training specific subtasks). Thus, we can 
extend a prescription stated earlier: the designer may have to ask what
patterns of indices relate to what criteria of effectiveness at what stage of
training for which subtask.

Second, correlational support for the indices was obtained. Relationships
between task indices and judged device "effectiveness" were demonstrated,
though not for all the subtasks. Unfortunately, one of the subtasks not
included among the significant correlations, set-up, was of primary importance
here, since it provided the only generalization test of successful prediction
in the laboratory. Therefore, comparisons between the field and laboratory
must be made with due caution.

Some correspondence was obtained between the patterns of indices which were
selected by the regression program for the field data and for the laboratory
data. DEI and D% were common to and prominent in both field and laboratory
set-up. The E index was selected by the regression program for field set-up,
while E% was selected for the laboratory set-up. The AA% index was prominent in
the laboratory, but did not appear at all for field set-up. The discrepancies
may have occurred in part because many of the indices are difficult to reflect
in the types of flow chart used in the field. The E% index, for example,
directly measures the amount of "clutter on a panel"; i.e., displays and
controls which are not used for a particular subtask. Though the instructors
were thoroughly familiar with their devices, they did work from flow charts
rather than from the actual device and may not have considered such factors as
E%. The AA% index reflects looping behavior, and similarly may not have been
fully appreciable from the flow charts. In view of these methodological
problems, it was gratifying to find as much correspondence as did appear.

While it was possible to demonstrate validity for patterns of predictors, those
patterns were not uniform across subtasks, just as the patterns were not
entirely uniform across time in the laboratory work. For example, while DEI
entered into the regression for both localization and detection, it entered for
different criteria. And other than DEI there was no index which was common to
both localization and detection. When either of these subtasks was pooled, an
almost completely different pattern of predictors emerged. Somewhat consistent
with the laboratory data and the statement made earlier concerning variability
across subtasks, these facts would appear to indicate that different patterns
of predictors, i.e., different "factors," are needed depending upon the
particular use to which the trainer will be put. They also suggest the possible
fruitfulness of a factor analysis of the predictor and criterion data.

CONCLUSIONS AND IMPLICATIONS FOR RESEARCH 

The current research effort has supported the feasibility of relating quan- 
titative indices of equipment design to performance, at least for the restricted 
set of indices and trainer stacks examined in the present study. The effort has 
also supported the feasibility of predicting transfer effects from equipment
design indices. Moreover, this predictability has been demonstrated for sets of
indices corrected for the effects of a prepotent but trivial factor and has been 
shown to be more than a transient event. 

Clarification of the specific meaning of the data and development of a 
practical methodology require further research. In general, the following 
efforts appear warranted: 

a. Results of the present laboratory and field work need to be
   generalized to other classes of trainers. This includes
   determining the applicability of the current set of indices to
   other devices. An attempt is also required, if possible, to
   validate the laboratory results in the field based on trainee
   performance data.

b. Judgments obtained in the field via opinion sampling should be 
validated against actual performance measurement. 

c. Relationships between quantitative indices and transfer of training
   require more rigorous investigation, at least in the laboratory,
   but preferably under field conditions also.

d. Effects of individual differences and conditions of training
   need to be interwoven with effects of training task variation.

Subsequent phases of this program will deal with one or more of the issues 
raised above. 
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APPENDIX A 



SQS-26CX Set-up Subtask Operations Flow Chart 
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APPENDIX B 

Task Characteristic Index Values for Trainee
Subtasks Evaluated in the Field
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TABLE B- 1 . TASK CHARACTERISTIC INDEX 
VALUES FOR SONAR SET-UP TASKS 



                                              Task indices
Device                  MAIN  CNTG    TA  CONT  DISP     E      LV   AA%    F%

21B55 0A1283              24    15    39    16     6    22  2500.2    62    94
21A39/2 0A1283            48    12    60    17     6    23  2927.8    46    86
21B55 BQR-2B              27     6    33    15    10    25  2616.7    78    95
21A39/2 BQR-2C            26     9    35    11    11    22  2399.9    67    82
21A39/2 BQR-7B           [?]    17    47   [?]    13    27  3149.9    66    62
14A2/C1 SQS-23B           25     4    29   [?]     7    24  2466.7    84   100
X14A2 SQS-23 (TRAM)       17    12    29    15     5    20  2040.0    69    92
14E3 SQS-4                23    17    40    15    10    25     [?]   [?]   [?]
14E14 SQS-4               26    14    40    15    11    26  2933.6    72    97
SQS-26CX                  24    24    48    20     8    28  2799.8    57    95
14E10 AQS-13              41     5    46    19     6    25  2731.6    58    80
14B31B AQA-1              51    38    89    18     5    23  3680.3    39    93
14B31B ASA-20            112     7   119    22     5    27  4369.9    34    67
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TABLE B-1 . TASK CHARACTERISTIC INDEX 
VALUES FOR SONAR SET-UP TASKS (Cont) 











                                              Task indices
                           DEI
Device                  x 10^-4    D%    C%    E%  CRPS   FBR  INFO  INST

21B55 0A1283              23.?0    67    84    79    19    10    10     0
21A39/2 0A1283            26.84    67    85   [?]   [?]    24     6     1
21B55 BQR-2B              25.07    62    65    54    17    10     6     0
21A39/2 BQR-2C            65.53    85    61    71    14    10    11     0
21A39/2 BQR-7B            25.23    81    58    68    19    10    18     0
14A2/C1 SQS-23B           32.23    33    65    51    18     6     5     4
X14A2 SQS-23 (TRAM)       16.51    28    54    44    15     6     8     4
14E3 SQS-4                35.98    83    75    78    18    12    10     4
14E14 SQS-4               37.57    85    58    67    17    11    12     2
SQS-26CX                  03.78    29    63    47    20    15    13     2
14E10 AQS-13              14.43    67    83    78    27     9    10     5
14B31B AQA-1              10.04    83    62    66    42    31    16     2
14B31B ASA-20             25.20    45   100    82    72    32    15     1

TABLE B-2. TASK CHARACTERISTIC INDEX
VALUES FOR DETECTION TASKS
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TABLE B-2. TASK CHARACTERISTIC INDEX 
VALUES FOR DETECTION TASKS (Cont) 



                                              Task indices
                           DEI
Device                  x 10^-4    D%    C%    E%  CRPS   FBR  INFO  INST

21B55 0A1283             234.04    33    15    27    12     3    10     6
21B55 BQR-2B              62.26    31     8    17     7     5    18     1
21A39/2 BQR-2C            48.21    38    11    23     4     3    11     0
21A39/2 BQR-7B            61.34    31     8    17     5     3    14     0
14A2/C1 SQS-23B           28.68    20    12    15     6     4    10     0
X14A2 SQS-23 (TRAM)       55.46    22     7    13     7     5    13     0
14E3 SQS-4                61.28    17    15    16     3     2     2     0
14E14 SQS-4               29.73    23    12    15     7     5    13     0
SQS-26CX                 594.23    11     6     8     2     0     5     0
14E10 AQS-13              20.66    22    29    27     9     3     8     0
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TABLE B-3. TASK CHARACTERISTIC INDEX 
VALUES FOR LOCALIZATION TASKS 



                                              Task indices
Device                  MAIN  CNTG    TA  CONT  DISP     E      LV   AA%    F%

21B55 0A1283              25    10    35     5     5    10  2116.2    56    51
21B55 BQR-2B              36     8    44     8     8    16  2155.7    45    60
21A39/2 BQR-2C            19    12    31     4     5     9  1263.4    31    32
21A39/2 BQR-7B             6     9    15     2     5     7   600.0   [?]   [?]
14A2/C1 SQS-23B           11    20    31     6     4    10  1395.5   [?]   [?]
X14A2 SQS-23 (TRAM)       11     0    11     4     4     8   700.0    58    91
14E3 SQS-4                 5     2     7     3     2     5   399.9    43    50
14E14 SQS-4               14     4    18     5     4     9  1000.0    49   100
SQS-26CX                  14     9    23     5     6    11  1414.4    57    92
14E10 AQS-13              17     7    24     6     4    10  1334.4    50    75
14B31B AQA-1              36    21    57     7     6    13  2481.3    38    60
14B31B ASA-20             30    14    44    11     6    17  2224.5    47    58
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TABLE B-3. TASK CHARACTERISTIC INDEX 
VALUES FOR LOCALIZATION TASKS (Cont) 



                                              Task indices
                           DEI
Device                  x 10^-4    D%    C%    E%  CRPS   FBR  INFO  INST

21B55 0A1283             111.69    56    26    36    14     8    13     5
21B55 BQR-2B              37.28    50    33    40    18     7    19     0
21A39/2 BQR-2C            25.85    38    22    29    10     4    17     0
21A39/2 BQR-7B            43.80    31     8    17     3   [?]   [?]   [?]
14A2/C1 SQS-23B             [?]   [?]   [?]    22    14   [?]    11     0
X14A2 SQS-23 (TRAM)         [?]   [?]    14    17     4     2     5     0
14E3 SQS-4                48.10    17    15    17     3     2     2     0
14E14 SQS-4               27.38    31    19    23     6     4     8     0
SQS-26CX                  37.?0    21    16    19     9     5     9     0
14E10 AQS-13              26.60    44    26    31     8     7     9     0
14B31B AQA-1              15.56   100    24    37    21     9    27     0
14B31B ASA-20             17.73    55    50    52    19    10    15     0
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TABLE B-4. TASK CHARACTERISTIC INDEX 
VALUES FOR CLASSIFICATION TASKS 



                                              Task indices
Device                  MAIN  CNTG    TA  CONT  DISP     E      LV   AA%    F%

21B55 0A1283               6     0     6     2     2     4   400.0    50    75
21B55 BQR-2B               6     0     6     2     2     4   300.0    25    75
21A39/2 BQR-2C             8     0     8     2   [?]     0   500.0    53   100
21A39/2 BQR-7B            12     0    12     3     3     6   850.0    64    50
14A2/C1 SQS-23B           12     3    15     2     2     4  1120.0    62    61
X14A2 SQS-23 (TRAM)       10    11    21     6     4    10  1133.3    48    52
14E3 SQS-4                13     6    19     4     4     8  1016.8    46    91
14E14 SQS-4                5     0     5     1     2     3   400.0    60    26
SQS-26CX                  15     5    20     2     4     6  1311.3    57    68
14E10 AQS-13               8     1     9     4     2     6   466.7    40   100
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TABLE B-4. .TASK CHARACTERISTIC INDEX 
VALUES FOR CLASSIFICATION TASKS (Cont) 



                                              Task indices
                           DEI
Device                  x 10^-4    D%    C%    E%  CRPS   FBR  INFO  INST

21B55 0A1283             273.86    22    10    14     2     0     4     0
21B55 BQR-2B              49.93    12     8    10     2     0     4     0
21A39/2 BQR-2C            48.04    31    11    19     2     0     6     0
21A39/2 BQR-7B           123.40    19    12    15     4     0     8     0
14A2/C1 SQS-23B           91.51    10     8     9     7     0     8     0
X14A2 SQS-23 (TRAM)       18.75    22    21    22     7     5     9     0
14E3 SQS-4                 3.60    33    20    25     7     3     9     0
14E14 SQS-4              342.33    15     4     8     1     0     4     0
SQS-26CX                  16.50    14     6    10     4     3    13     0
14E10 AQS-13              26.70    22    17    18     4     1     4     0
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APPENDIX C 



Operations Flow Chart for the Complex-All 
Synthetic Set-up Task 
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Instructions for Magnitude Estimates 
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MAGNITUDE ESTIMATES 
What we would like you to do today is to make four judgments about each 
sonar subtask (the comparison task) which is performed on your device. 
The four types of judgments are described below, together with some prac- 
tice examples. 



1. Relative to the case of the standard sonar subtask, how many 
more or how many fewer units of practice would the average A-school
trainee need on the comparison task in order to perform it as quick- 
ly and as accurately as the typical instructor? 

[In order to do this in the case of the standard task, 100 units
of practice are required.] 



2. Relative to the case of the standard sonar subtask on which 
2 hours of practice was given, how much better or worse would 
the average A-school trainee perform the comparison task after 
the same amount of practice? 

[Performance on the standard task after 2 hours of practice is
at the 50 level.] 



3. If the degree of transfer of training from the training situ- 
ation to the operational situation is 50 on the standard task, 
how much greater or less than 50 is it on the comparison task? 

[The degree of transfer of training on the standard task is 50.] 



4. Relative to the case of the standard sonar subtask, how much 
more or less difficult would it be for the average A-school trainee 
to learn to perform the comparison task? 

[The difficulty in learning to perform the standard subtask is 100.]

Below are four practice examples. Please complete them now.

1. With respect to the first type of estimate described above, if you thought
your task required 2.5 times as much practice as the standard, what value
would you assign?

2. With respect to the second type of estimate, if you thought trainees
would perform only one-third as well, what value would you assign?

3. In the third type of estimate, what value would you assign if you thought
the degree of transfer of training on the comparison task was:

   Twice as great relative to the standard?

   Half as great relative to the standard?

   At the same level as in the standard case?

4. In the final type of estimate, what value would you assign if you thought
the comparison task was 1-1/2 times more difficult than the standard?



In making the estimates, remember: 

• Think in terms of our task descriptions rather than in terms of 
how you do or teach the task. 

• Make your judgments with respect to the overall task. 

• Remember to assign a value to your judgment which is some fraction 
or multiple "times" the value of the standard. 
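
The arithmetic behind the practice examples can be sketched as follows. The
standard values are those stated in the instructions above; the function and
dictionary names are illustrative only, and reading "1-1/2 times more
difficult" as a multiple of 1.5 is an assumption.

```python
# Standard values assigned to the standard subtask for each type of judgment.
STANDARD_VALUES = {
    "training_time": 100,  # units of practice
    "proficiency": 50,     # level after 2 hours of practice
    "transfer": 50,        # degree of transfer of training
    "difficulty": 100,     # difficulty of learning the subtask
}


def ratio_estimate(judgment, multiple):
    """Value assigned to a comparison task judged `multiple` times the standard."""
    return STANDARD_VALUES[judgment] * multiple


print(ratio_estimate("training_time", 2.5))           # 2.5 times as much practice
print(round(ratio_estimate("proficiency", 1 / 3), 1))  # one-third as well
print(ratio_estimate("transfer", 2))                   # twice as great
print(ratio_estimate("transfer", 0.5))                 # half as great
print(ratio_estimate("transfer", 1))                   # same level
print(ratio_estimate("difficulty", 1.5))               # 1-1/2 times, read as 1.5
```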
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APPENDIX. E 
Subtask Standard Flow Charts 
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APPENDIX F 
Ratio Estimate Answer Sheet 






NAVTRADEVCEN 71-C-00S9-1 



Name 



Task 



Type of Estimate                                  Standard Value

1. Relative to the case of the standard                 100
   sonar subtask, how many more or how many
   fewer units of practice would the average
   A-school trainee need on the comparison
   task in order to perform it as quickly and
   as accurately as the typical instructor?

2. Relative to the case of the standard                  50
   sonar subtask on which 2 hours of practice
   was given, how much better or worse would
   the average A-school trainee perform the
   comparison task after the same amount of
   practice?

3. If the degree of transfer of training                 50
   from the training situation to the
   operational situation is 50 on the standard
   task, how much greater or less than 50 is
   it on the comparison task?

4. Relative to the case of the standard                 100
   sonar subtask, how much more or less
   difficult would it be for the average
   A-school trainee to learn to perform the
   comparison task?
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APPENDIX G 

Mean Instructor Ratio Estimates for the Four Subtasks

TABLE G-1. MEAN INSTRUCTOR RATIO
ESTIMATES FOR THE SET-UP TASK

                                          Criteria
                        Training  Proficiency               Task
Device                    Time       Level     Transfer  Difficulty

21B55 0A1283               136         42         43        126
21A39/2 0A1283             225         22         22        225
21B55 BQR-2B                87         57         60         91
21A39/2 BQR-2C              90         55         54         93
21A39/2 BQR-7B             195         20         20        170
14A2/C1 SQS-23B             50        [?]        [?]         50
X14A2 SQS-23 (TRAM)         79         63         61         79
14E3 SQS-4                 143         31         40        13?
14E14 SQS-4                125         38         44        113
SQS-26CX                   390         25         27        260
14E10 AQS-13               172         46         4?        152
14B31B AQA-1               231         33         27        156
14B31B ASA-20              333         20         23        253

Composite weights:  1.00  1.00
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TABLE G-2. MEAN INSTRUCTOR RATIO- 
ESTIMATES FOR THE DETECTION TASK 



                                          Criteria
                        Training  Proficiency               Task
Device                    Time       Level     Transfer  Difficulty

21B55 0A1283               220        [?]        [?]        158
21B55 BQR-2B               144        [?]        [?]        163
21A39/2 BQR-2C             100         50         50        100
21A39/2 BQR-7B             100         50         50        100
14A2/C1 SQS-23B            125         45         50        119
X14A2 SQS-23 (TRAM)        129         34         29        129
14E3 SQS-4                  44        125         90         63
14E14 SQS-4                125         45         45        125
SQS-26CX                    70         90         60         90
14E10 AQS-13               170         31        [?]        140

Composite weights:  1.49  1.32

TABLE G-3. MEAN INSTRUCTOR RATIO
ESTIMATES FOR THE LOCALIZATION TASK

                                          Criteria
                        Training  Proficiency               Task
Device                    Time       Level     Transfer  Difficulty

21B55 0A1283               200        [?]        [?]        180
21B55 BQR-2B               410        [?]        [?]        299
21A39/2 BQR-2C             133         35         39        130
21A39/2 BQR-7B              76         61         80         71
14A2/C1 SQS-23B            363         19         35        438
X14A2 SQS-23 (TRAM)         85         70         61         83
14E3 SQS-4                  49        150        119         45
14E14 SQS-4                 86         70         88        [?]
SQS-26CX                   110         55         55        105
14E10 AQS-13                92         52         50         94
14B31B AQA-1               156         29         29        150
14B31B ASA-20              267         22         2?        217

Composite weights:  1.13  1.22

TABLE G-4. MEAN INSTRUCTOR RATIO
ESTIMATES FOR THE CLASSIFICATION TASK

                                          Criteria
                        Training  Proficiency               Task
Device                    Time       Level     Transfer  Difficulty

21B55 0A1283                59         91        100         66
21B55 BQR-2B                87         52         55         90
21A39/2 BQR-2C             100         50         50        100
21A39/2 BQR-7B             100         50         50        100
14A2/C1 SQS-23B            191         31         34        166
X14A2 SQS-23 (TRAM)        [?]         21         21        [?]
14E3 SQS-4                 212         28         29        300
14E14 SQS-4                200         33         54        175
SQS-26CX                   170         41         27        115
14E10 AQS-13               110         50         50        100

Composite weights:  [?]  2.01
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APPENDIX H 

Intercorrelations cf Task Index Values and Adjusted 
Mean Instructor Ratio Estimates 
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TABLE H-1. IffTERCORRELATIONS OF TASK INDEX VALUES 
AND MEAN INSTRUCTOR RATIO ESTIMATES— SET-UP^ 



                               Criteria
Task        Training  Proficiency               Task
Indices       Time       Level     Transfer  Difficulty

MAIN           56*       -43        -50         54
CNTG           43        -46        -53         34
TA             69**      -57*       -66*        64*
CONT           72**      -30        -39         64*
DISP          -24        -09         01        -19
E              60*       -47        -46         56*
LV             67*       -62*       -69**      -61*
AA%           -76**       70**       78**      -77**
F%            -35         50         54        -42
DEI           [?]         25         32        [?]
D%            -19        -36        -29        -08
C%             39        -35        -55         48
E%             12        -50        -44         28
CRPS           61*       -44        -52         55*
FBR            68*       -60*       -69**       66*
INFO           56*       -65*       -70**       47
INST          -16         36         35        -21

*p < .05
**p < .01

Decimal points have been omitted for clarity.
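
The significance markers can be checked against the usual t test for a
correlation coefficient. The sketch below rests on several assumptions: that
two-tailed tests were used, that n = 13 devices entered the set-up
correlations, and that the critical values 2.201 and 3.106 (standard tabled
values of t for 11 degrees of freedom at the .05 and .01 levels) apply. The
function names are illustrative only.

```python
import math


def t_from_r(r, n):
    """t statistic for a correlation r based on n paired observations."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


def stars(entry, n, t_05, t_01):
    """Significance marker for a table entry printed without its decimal point."""
    t = abs(t_from_r(entry / 100.0, n))
    if t >= t_01:
        return "**"
    if t >= t_05:
        return "*"
    return ""


# Set-up subtask: 13 devices, hence 11 degrees of freedom.
N_DEVICES = 13
T_05, T_01 = 2.201, 3.106  # two-tailed critical values of t for 11 df

for label, entry in [("TA, training time", 69),
                     ("LV, training time", 67),
                     ("CNTG, training time", 43)]:
    print(label, entry, stars(entry, N_DEVICES, T_05, T_01))
```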
TABLE H-2. INTERCORRELATIONS OF TASK INDEX VALUES
AND MEAN INSTRUCTOR RATIO ESTIMATES: DETECTION(a)

Task                          Criteria
Indices     Training      Proficiency   Transfer      Task
            Time          Level                       Difficulty

MAIN         57           -65*          -76**          36
CNTG        -52           -69*          -68*           63*
FA           63*          -77**         -84**          53
CONT         66*          -60*          -69*           52
DISP         59*          -72**         -71*           O /
E            71*          -73**         -78**          52
LV           63*          -75**         -84**          50
AA%         -07            10           -03           -14
F%          -10            06            09           -06
DEI          01            11            08           -no
D%          -21           -49           -58*           10
C%           65*          -54           -66*           48
B%           56           -61*          -72**          37
CRPS         74**         -74**         -84**          64*
FBR          56           -70*          -79**          48
INFO         48           -72**         -75**
INST         08           -14           -19            06

 *p < .05
**p < .01

(a) Decimal points have been omitted for clarity.



















NAVTRADEVCEN 71-C-0059-1 



TABLE H-3. INTERCORRELATIONS OF TASK INDEX VALUES
AND MEAN INSTRUCTOR RATIO ESTIMATES: LOCALIZATION(a)

Task                          Criteria
Indices     Training      Proficiency   Transfer      Task
            Time          Level                       Difficulty

MAIN         57           -76*          -79**          75*
CNTG         33           -26           -06            18
FA           72*          -89**         -86**          85**
CONT         40           -28           -18            27
DISP        -04           -28           -18            11
E            48           -65*          -42            47
LV           79**         -79**         -80**          89**
AA%          SC           -27           -32            49
F%           00            00            00            00
DEI         -16            36            13           -19
D%           45           -56           -47            42
C%           39           -13           -05            19
B%           67*          -44           -28            44
CRPS         97**         -74*          -71*           85**
FBR          43           -67*          -60            60
INFO         43           -78**         -76*           66*
INST         72*          -29           -38            53

 *p < .05
**p < .01

(a) Decimal points have been omitted for clarity.
TABLE H-4. INTERCORRELATIONS OF TASK INDEX VALUES
AND MEAN INSTRUCTOR RATIO ESTIMATES: CLASSIFICATION(a)

Task                          Criteria
Indices     Training      Proficiency   Transfer      Task
            Time          Level                       Difficulty

MAIN         43           -44           -68*           38
CNTG         74*          -63*          -71*           62
FA           68*          -62                          58
CONT         35           -39           -46            4J
DISP         38           -41           -59            40
E            42           -46           -60            47
LV           65*          -58           -75*           48
AA%          30           -16           -12            10
F%          -40            29            08           -10
DEI         -16            40            67*          -21
D%          -02            04            00            35
C%           24           -27           -33            47
B%           17           -19           -26            47
CRPS         63*          -60           -69*           65*
FBR          65*          -54           -66*          -55
INFO         52           -48           -71*           36
INST         00            00            00            00

 *p < .05
**p < .01

(a) Decimal points have been omitted for clarity.